SYSTEM AND METHOD FOR FRAME PACKING
BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates generally to the field of packetized voice transmission, and 
more particularly, to a system and method for packing frames of voice packets. 

2. Description of the Related Art 

A typical communications network 10 is shown in Figure 1. Individual computers 12, 14, 
16, 18, 20 are connected via a network interface 30, 32, 34 to a network, such as the Internet 40. 
The network 10 forwards and routes data sent from a source computer to a destination computer. 
Increasingly, such networks are also used to transmit voice signals, as well as data. In fact, 
digital telephones 50 may be connected directly to the network 10. However, since such 
networks were originally designed to transmit data, and not necessarily in real-time, there are 
many problems associated with transmitting voice conversations between two parties. 

In packetized voice communication systems (e.g. Voice-over-IP (VoIP), frame relay, or 
ATM), voice data is digitized and lossily compressed into frames. Each frame represents the 
voice data for a small unit of time, typically 30 milliseconds. Frames are then transported over 
the network from a source to a destination, where the frames are then decompressed. 

In a Voice-over-IP (VoIP) network, each frame of voice data is typically encapsulated in 
one datagram. The Internet Protocol (IP) imposes a minimum of 20 bytes of header, containing 
such information as the destination IP address. The User Datagram Protocol (UDP), typically
used for voice transport applications, adds another six bytes of header information. A voice
frame encoded with, for example, the Lucent 9600 codec is 18 bytes long (Figure 2).

This results in a total packet length of 44 bytes, of which 26 bytes are overhead. Because
roughly 60% of the bandwidth is consumed by the overhead that the Internet protocols impose
on each packet, the number of calls that can be supported at one time is greatly reduced.
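
The arithmetic above can be checked with a short calculation. The following is a minimal sketch that uses only the header and frame sizes stated in this description; the function and constant names are illustrative and do not come from the original text.

```python
def overhead_fraction(payload_bytes, header_bytes):
    """Return the fraction of a packet that is occupied by header bytes."""
    return header_bytes / (header_bytes + payload_bytes)

IP_HEADER = 20   # bytes, as stated in this description
UDP_HEADER = 6   # bytes, as stated in this description
FRAME = 18       # bytes, one compressed voice frame as stated above

header = IP_HEADER + UDP_HEADER
packet_len = header + FRAME
fraction = overhead_fraction(FRAME, header)

print(f"packet: {packet_len} bytes, header: {header} bytes, "
      f"overhead: {fraction:.1%}")
```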

One prior art solution has been to group multiple frames from a call together in a single 
packet in order to increase the amount of payload, for a given header length. For example, 
equipment manufactured by Nuera can be configured to, instead of transmitting each voice frame 
in its own packet, group several consecutive frames together from a single voice stream and
transmit those frames together in a single datagram. 

This approach, however, has a significant disadvantage. If the equipment is configured to 
group five frames together before transmitting one packet, the end-to-end latency is increased
fivefold, in this case to 150 ms. Since each packet contains five 30 ms frames, a packet can only
be transmitted every 150 ms. Grouping smaller numbers of frames together will reduce the 
latency, but again increase the overhead. In Figure 3, the heavy black lines indicate which frames 
have been packed together into a single datagram. Figure 4 is a table showing the tradeoff 
between latency and overhead. As illustrated in the table, as the number of frames per packet is 
increased, the overhead is reduced, but the latency is increased.
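
The tradeoff described above can be illustrated numerically. The sketch below is not a reproduction of Figure 4; it simply derives the packetization delay and header overhead from the figures quoted in the text (30 ms frames, 18-byte frames, 26 bytes of header), and the function name is hypothetical.

```python
FRAME_MS = 30      # duration represented by one voice frame
FRAME_BYTES = 18   # size of one compressed frame
HEADER_BYTES = 26  # per-packet header size stated in this description

def sequential_packing_tradeoff(frames_per_packet):
    """Return (delay in ms, overhead fraction) when consecutive frames
    of a single call are grouped into one packet."""
    delay_ms = frames_per_packet * FRAME_MS
    payload = frames_per_packet * FRAME_BYTES
    overhead = HEADER_BYTES / (HEADER_BYTES + payload)
    return delay_ms, overhead

for n in (1, 2, 3, 5):
    delay_ms, overhead = sequential_packing_tradeoff(n)
    print(f"{n} frame(s) per packet: {delay_ms} ms, {overhead:.0%} overhead")
```

As the number of frames per packet grows, the overhead column falls while the delay column rises in direct proportion.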

A further disadvantage of this method is that if one packet is lost due to network 
congestion or other network problems, a noticeable click, pop or dropout will occur on the line. 
(The human ear is generally incapable of noticing gaps shorter than approximately 100 ms as 
anything other than a click.) 

Another solution simply lengthens the time "window" for each frame, from say 30 ms to
50 ms. This solution is not satisfactory since the latency is also increased. It would thus be
desirable to have an improved system and method for transmitting packetized voice frames. 

SUMMARY OF THE INVENTION 

In general, the present invention is a system and method for packing frames of voice
packets. Instead of packing sequential frames from the same call, frames from different calls at a
same time interval are packed together into a single packet (datagram). This increases the 
effective data payload for each packet, without increasing the transmission latency for each call. 

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be readily understood by the following detailed description in 
conjunction with the accompanying drawings, wherein like reference numerals designate like 
structural elements, and in which: 

Figure 1 illustrates a typical Voice-over-IP network; 
Figure 2 illustrates a typical IP voice packet;

Figure 3 illustrates a prior art frame packing technique; 

Figure 4 is a table illustrating the relationship between overhead and latency for a prior art
frame packing approach; 

Figure 5 illustrates the frame packing technique of the present invention;

Figure 6 is a flowchart of a voice transmission process incorporating one embodiment of
the present invention; and

Figure 7 is a block diagram of a network interface incorporating the present invention. 

DETAILED DESCRIPTION OF THE INVENTION 

The following description is provided to enable any person skilled in the art to make and 
use the invention and sets forth the best modes contemplated by the inventor for carrying out the 
invention. Various modifications, however, will remain readily apparent to those skilled in the 
art, since the basic principles of the present invention have been defined herein specifically to 
provide a system and method for packing frames of voice packets. Any and all such 
modifications, equivalents and alternatives are intended to fall within the spirit and scope of the 
present invention. 

In general, the present invention combines time slices from the same moment in time 
from different calls into a single packet, improving efficiency and reducing the latency associated 
with prior art approaches. The present approach is illustrated in Figure 5. The frames from 
several different calls, representing the same 30 ms slice of time, are packed together into a 
single datagram. This approach is in contrast to the packing of several consecutive 30 ms slices 
for one call, as shown in Fig. 3. The present invention thus provides the same reduction of 
overhead as the prior art techniques, but substantially reduces or eliminates the associated 
disadvantages of latency and the effects of packet loss. 
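
One way the packing of Figure 5 might be realized is sketched below. The description does not specify how individual frames are delimited or identified inside the datagram, so the per-frame sub-header used here (a 2-byte call identifier followed by a 1-byte frame length) is purely an assumption for illustration, as are the function and variable names. Any such sub-header adds a few bytes per frame that the overhead figures in this description do not account for.

```python
import struct

# Hypothetical per-frame sub-header (an assumption, not from the source):
# 2-byte call ID and 1-byte frame length, in network byte order.
SUBHEADER = struct.Struct("!HB")

def pack_time_slice(frames_by_call):
    """Combine one frame from each call, all from the same time interval,
    into a single datagram payload.

    frames_by_call maps a call ID (0..65535) to that call's compressed
    frame (bytes, at most 255 long) for the current time interval.
    """
    parts = []
    # Sorting by call ID is an arbitrary choice made for reproducibility.
    for call_id, frame in sorted(frames_by_call.items()):
        parts.append(SUBHEADER.pack(call_id, len(frame)))
        parts.append(frame)
    return b"".join(parts)

# Three calls, each contributing its 18-byte frame for the same 30 ms slice.
slice_frames = {7: b"\x01" * 18, 3: b"\x02" * 18, 12: b"\x03" * 18}
payload = pack_time_slice(slice_frames)
print(len(payload))
```

Because each call contributes only its current frame, no call has to wait for later frames before the packet can be sent.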

Figure 6 is a flowchart of a voice transmission process incorporating one embodiment of 
the present invention. First, at step 1, the voice signals for each call are digitized into frames. 
Then at step 2 the frames are processed to remove line noise, echo, etc. and are compressed using 
a standard compression scheme, as is well known in the art. Frames from each call 
corresponding to a same time interval are then selected (step 3) and combined into a single 
packet (step 4). The packet is then transmitted to a destination via a packet-switched IP network,
such as the Internet, at step 5. When the packet arrives at a destination, the packet is parsed
(separated) back into the separate frames at step 6. Each frame is routed to its associated call
(voice interface connection) at step 7. The frames for each call are processed and decompressed 
at step 8. Finally, the digital frames are converted to an analog voice signal. 
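
The receiving side of the process (steps 6 and 7) might be sketched as follows. As before, the per-frame sub-header (2-byte call identifier, 1-byte length) is a hypothetical layout assumed only for this example, since the description leaves the packet format open; the function names and the use of per-call queues are likewise illustrative.

```python
import struct
from collections import defaultdict

# Hypothetical per-frame sub-header: 2-byte call ID, 1-byte frame length.
SUBHEADER = struct.Struct("!HB")

def parse_packet(payload):
    """Split a packed payload back into (call_id, frame) pairs (step 6)."""
    frames = []
    offset = 0
    while offset < len(payload):
        call_id, length = SUBHEADER.unpack_from(payload, offset)
        offset += SUBHEADER.size
        frame = payload[offset:offset + length]
        if len(frame) != length:
            raise ValueError("truncated frame in packet")
        frames.append((call_id, frame))
        offset += length
    return frames

def route_frames(frames, call_queues):
    """Append each frame to the queue of its associated call (step 7)."""
    for call_id, frame in frames:
        call_queues[call_id].append(frame)

# Build a two-call payload by hand to exercise the receiver.
payload = (SUBHEADER.pack(3, 2) + b"\xaa\xbb" +
           SUBHEADER.pack(7, 2) + b"\xcc\xdd")
call_queues = defaultdict(list)
route_frames(parse_packet(payload), call_queues)
print(dict(call_queues))
```

Each queue would then feed the decompression and conversion stage for its own call.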

The present method of frame packing can be significantly extended to a large number of 
calls, since there is no sacrifice in latency in doing so. When the number of frames per packet is
increased to the order of hundreds, up to the maximum packet size permitted by the underlying
transport (in the case of Ethernet, around 1500 bytes), a large number of calls may be packed
into a single packet.
Depending upon the CODEC, approximately 160 calls may be packed into a single packet. This 
reduces the overhead required on network interfaces, routers and switches, since most of these 
devices' switching ability is governed more by the number of packets per second than by the
number of bytes transmitted. Thus, the present invention also reduces processing delay in the
sending and receiving nodes, since far fewer packets need to be transmitted and
received. 
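
The capacity and packet-rate argument above can be estimated with a short calculation. This is a rough sketch under stated assumptions: the 9-byte frame size is a hypothetical value chosen only to show how a more aggressive codec could approach the figure of approximately 160 calls, and the calculation ignores any per-frame identifier bytes a real packet format would need. The function names are illustrative.

```python
import math

def calls_per_packet(max_packet_bytes, header_bytes, frame_bytes):
    """Number of whole frames that fit in one packet after the header."""
    return (max_packet_bytes - header_bytes) // frame_bytes

def packets_per_second(num_calls, frame_ms, per_packet):
    """Packets sent per second when per_packet calls share each packet."""
    intervals_per_second = 1000 / frame_ms
    return math.ceil(num_calls / per_packet) * intervals_per_second

MAX_PACKET = 1500  # bytes, approximate Ethernet limit cited above
HEADER = 26        # bytes, header size stated in this description

print(calls_per_packet(MAX_PACKET, HEADER, 18))  # 18-byte frames
print(calls_per_packet(MAX_PACKET, HEADER, 9))   # hypothetical 9-byte frames

# 160 calls with 30 ms frames: one packet per call versus shared packets.
unpacked = packets_per_second(160, 30, 1)
packed = packets_per_second(160, 30, calls_per_packet(MAX_PACKET, HEADER, 9))
print(round(unpacked), round(packed))
```

The packet rate seen by routers and switches falls by roughly the number of calls sharing each packet, while each call still sends one frame per time interval.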

Figure 7 is a block diagram of a network interface 70 incorporating the present invention. 
A network I/O module 71 connects the interface 70 to an external network. The interface 70
includes a processor 72, an I/O bus 73, and a memory 74 for processing the call packets. The 
network interface 70 has an internal operating system 75 to control the internal operation of the 
interface. An internal system ROM 76 stores the operative control code for the interface, 
including software code for performing the frame packing logic of the present invention. The 
network interface further includes telephony hardware interface 78 for connecting to telephone 
equipment, and optionally includes a DSP voice compression module 79 and a Forward Error 
Correction (FEC) module 77 for providing error correction services. 

Note that since each packet contains multiple frames from a same time interval, the 
system latency is greatly reduced, as compared to the prior art approaches. While the present 
invention has been described herein specifically with reference to a Voice-over-IP network, the 
present invention may be advantageously applied to any network that utilizes packets to transmit 
frames of data, in order to reduce header overhead and transmission latency. 

Those skilled in the art will appreciate that various adaptations and modifications of the 
just-described preferred embodiments can be configured without departing from the scope and 
spirit of the invention. Therefore, it is to be understood that, within the scope of the appended
claims, the invention may be practiced other than as specifically described herein. 

1. A method for packing frames for transmission in a network, the method comprising:
selecting frames from a same time interval from different connections;
combining the frames from the same time interval into a single packet; and
transmitting the packet over the network.

2. The method of Claim 1, further comprising parsing the packet into separate frames at a
receiving end. 

3. The method of Claim 2, further comprising routing each frame to an associated voice 
interface connection. 

4. The method of Claim 3, wherein prior to selecting frames, the method further
comprises:

digitizing a voice signal for each connection into frames; and 
processing the frames. 

5. The method of Claim 4, further comprising:
processing each routed frame at the receiving end; and
converting the frames to a voice signal for each voice interface connection.

6. The method of Claim 5, wherein the network is a network that supports Internet 
Protocol (IP). 

7. A communications system comprising:

at least two source communication nodes and two destination communication nodes; 
a network supporting an Internet Protocol (IP); and 

a network interface connected between the source communication nodes and the network; 
a network interface connected between the destination communication nodes and the
network;

wherein a network interface combines frames from different sources at a same time 
interval into a single packet and transmits the packet over the network. 

8. The communications system of Claim 7, wherein a network interface at a destination
separates the frames in the packet and routes each frame to an appropriate voice interface
connection. 

9. A method for transmitting voice frames in a packet-switching network, the method
comprising:

digitizing a voice signal for each call into frames; 
processing the frames; 

selecting frames from a same time interval from different calls; 
combining the frames from the same time interval into a single packet; 
transmitting the packet over the network;

separating the packet into separate frames at a receiving end; 
routing each frame to an associated voice interface; 
processing each routed frame at the receiving end; and 
converting the frames to a voice signal for each call. 

10. The method of Claim 9, wherein the packets are Internet Protocol (IP) packets. 

11. A network interface comprising:

a processor; 

a memory connected to the processor; and 
a system ROM; 

wherein the system ROM stores execution code for the processor, the execution code
comprising:

execution code for selecting frames from a same time interval from different connections; 
execution code for combining the frames from the same time interval into a single packet;
and
execution code for transmitting the packet over the network.